Gap Filling as Exact Path Length Problem

Authors

  • Leena Salmela
  • Kristoffer Sahlin
  • Veli Mäkinen
  • Alexandru I. Tomescu
Abstract

One of the last steps in a genome assembly project is filling the gaps between consecutive contigs in the scaffolds. This problem can be naturally stated as finding an s-t path in a directed graph whose sum of arc costs belongs to a given range (the estimate on the gap length). Here s and t are any two contigs flanking a gap. This problem is known to be NP-hard in general. Here we derive a simpler dynamic programming solution than already known, pseudo-polynomial in the maximum value of the input range. We implemented various practical optimizations to it, and compared our exact gap-filling solution experimentally to popular gap-filling tools. Summing over all the bacterial assemblies considered in our experiments, we can in total fill 76% more gaps than the best previous tool, and the gaps filled by our method span 136% more sequence. Furthermore, the error level of the newly introduced sequence is comparable to that of the previous tools. The experiments also show that our exact approach does not easily scale to larger genomes, where the problem is in general difficult for all tools.
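The dynamic programming idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes positive integer arc costs, allows the path to revisit vertices, and uses hypothetical names (`exact_length_path`, `arcs`, `d_min`, `d_max`). It records, for every length up to `d_max`, which vertices are reachable from `s` by a path of exactly that length, then backtracks from `t`.

```python
from collections import defaultdict


def exact_length_path(arcs, s, t, d_min, d_max):
    """Return (cost, path) for one s-t path with d_min <= cost <= d_max, or None.

    arcs: iterable of (u, v, cost) triples; cost is assumed to be a positive integer.
    The returned path may repeat vertices (it is a walk in the graph).
    """
    if d_max < 0:
        return None

    # Group arcs by head vertex so each table row is filled from predecessors.
    incoming = defaultdict(list)
    for u, v, cost in arcs:
        if cost <= 0:
            raise ValueError("arc costs must be positive integers")
        incoming[v].append((u, cost))

    # pred[length][v] holds the last arc (u, cost) of some s-v path whose
    # total cost is exactly `length`; the start state maps to None.
    pred = [dict() for _ in range(d_max + 1)]
    pred[0][s] = None
    for length in range(1, d_max + 1):
        for v, ins in incoming.items():
            for u, cost in ins:
                if cost <= length and u in pred[length - cost]:
                    pred[length][v] = (u, cost)
                    break  # one witness per (length, v) is enough

    # Scan the allowed range and backtrack from the first feasible length.
    for length in range(max(d_min, 0), d_max + 1):
        if t in pred[length]:
            path, v, remaining = [t], t, length
            while pred[remaining][v] is not None:
                u, cost = pred[remaining][v]
                path.append(u)
                v, remaining = u, remaining - cost
            return length, path[::-1]
    return None


# Toy graph: a self-loop on "a" lets the path absorb one extra unit of length.
toy_arcs = [("s", "a", 2), ("a", "t", 3), ("s", "t", 4), ("a", "a", 1)]
result = exact_length_path(toy_arcs, "s", "t", 6, 6)
```

The table has `d_max + 1` rows and each row scans every arc once, so the running time of this sketch grows with the number of arcs times `d_max`. That dependence on the numeric value of the range bound, rather than on its encoding length, is what makes such an approach pseudo-polynomial.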

Similar resources

Path Analysis of Yield Related Traits in Wheat Genotypes under Normal Irrigation and Drought Stress Conditions

For identification of correlations and relations among different traits in bread wheat, 30 genotypes were investigated based on a split plot experiment in the form of randomized complete block design with three replications under normal and moisture stress conditions during the 2016-2017 crop season. The results of the analysis of variance showed that the effect of genotypes was significant for...

Motion Planning of Car-Like Robots Using the Fast Marching Method

The Robot Motion Planning (RMP) problem deals with finding a collision-free start-to-goal path for a robot navigating among workspace obstacles. Such a problem is also encountered in path planning of intelligent vehicles and Automatic Guided Vehicles (AGVs). In terms of kinematic constraints, the RMP problem can be categorized into two groups of Holonomic and Nonholonomic problems. In the first...

Spectral gap of the totally asymmetric exclusion process at arbitrary filling

We calculate the spectral gap of the Markov matrix of the totally asymmetric simple exclusion process (TASEP) on a ring of L sites with N particles. Our derivation is simple and self-contained and extends a previous calculation that was valid only for half-filling. We use a special property of the Bethe equations for TASEP to reformulate them as a one-body problem. Our method is closely related...

Shortest Path Problem with Gamma Probability Distribution Arc Length

We propose a dynamic program to find the shortest path in a network having gamma probability distributions as arc lengths. Two operators of sum and comparison need to be adapted for the proposed dynamic program. Convolution approach is used to sum two gamma probability distributions being employed in the dynamic program.

A Note on the Integrality Gap in the Nodal Interdiction Problem

In the maximum flow network interdiction problem, an attacker attempts to minimize the maximum flow by interdicting flow on the arcs of network. In this paper, our focus is on the nodal interdiction for network instead of the arc interdiction. Two path inequalities for the node-only interdiction problem are represented. It has been proved that the integrality gap of relaxation of the maximum fl...

Journal:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 23, Issue 5

Pages: -

Publication date: 2015